371 research outputs found

    A Conserative Property of a Nested Relational Query Language

    Get PDF
    We proposed in [7] a nested relational calculus and a nested relational algebra based on structural recursion [6,5] and on monads [27,16]. In this report, we describe relative set abstraction as our third nested relational query language. This query language is similar to the well known list comprehension mechanism in functional programming languages such as Haskell [ll], Miranda [24], KRC [23], etc. This language is equivalent to our earlier query languages both in terms of semantics and in terms of equational theories. This strong sense of equivalence allows our three query languages to be freely combined into a nested relational query language that is robust and user-friendly

    Controlling False Positives in Association Rule Mining

    Get PDF
    Association rule mining is an important problem in the data mining area. It enumerates and tests a large number of rules on a dataset and outputs rules that satisfy user-specified constraints. Due to the large number of rules being tested, rules that do not represent real systematic effect in the data can satisfy the given constraints purely by random chance. Hence association rule mining often suffers from a high risk of false positive errors. There is a lack of comprehensive study on controlling false positives in association rule mining. In this paper, we adopt three multiple testing correction approaches---the direct adjustment approach, the permutation-based approach and the holdout approach---to control false positives in association rule mining, and conduct extensive experiments to study their performance. Our results show that (1) Numerous spurious rules are generated if no correction is made. (2) The three approaches can control false positives effectively. Among the three approaches, the permutation-based approach has the highest power of detecting real association rules, but it is very computationally expensive. We employ several techniques to reduce its cost effectively.Comment: VLDB201

    A Bounded Degree Property and Finite-Cofiniteness of Graph Queries

    Get PDF
    We provide new techniques for the analysis of the expressive power of query languages for nested collections. These languages may use set or bag semantics and may be further complicated by the presence of aggregate functions. We exhibit certain classes of graphics and prove that properties of these graphics that can be tested in such languages are either finite or cofinite. This result settles that conjectures of Grumbach, Milo, and Paredaens that parity test, transitive closure, and balanced binary tree test are not expressible in bah languages like BALG of Grumbach and Milo and BQL of Libkin and Wong. Moreover, it implies that many recursive queries, including simple ones like test for a chain, cannot be expressed in a nested relational language even when aggregate functions are available. In an attempt to generalize the finite-cofiniteness result, we study the bounded degree property which says that the number of distinct in- and out-degrees in the output of a graph query does not depend on the size of the input if the input is simple. We show that such a property implies a number of inexpressibility results in a uniform fashion. We then prove the bounded degree property for the nested relational language

    Relational Foundations For Functorial Data Migration

    Full text link
    We study the data transformation capabilities associated with schemas that are presented by directed multi-graphs and path equations. Unlike most approaches which treat graph-based schemas as abbreviations for relational schemas, we treat graph-based schemas as categories. A schema SS is a finitely-presented category, and the collection of all SS-instances forms a category, SS-inst. A functor FF between schemas SS and TT, which can be generated from a visual mapping between graphs, induces three adjoint data migration functors, ΣF:S\Sigma_F:S-instT\to T-inst, ΠF:S\Pi_F: S-inst T\to T-inst, and ΔF:T\Delta_F:T-inst S\to S-inst. We present an algebraic query language FQL based on these functors, prove that FQL is closed under composition, prove that FQL can be implemented with the select-project-product-union relational algebra (SPCU) extended with a key-generation operation, and prove that SPCU can be implemented with FQL

    Comparative analysis and assessment of M. tuberculosis H37Rv protein-protein interaction datasets

    Get PDF
    10.1186/1471-2164-12-S3-S2010th Int. Conference on Bioinformatics - 1st ISCB Asia Joint Conference 2011, InCoB 2011/ISCB-Asia 2011: Computational Biology - Proceedings from Asia Pacific Bioinformatics Network (APBioNet)12SUPPL.

    Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes

    Get PDF
    Complexes of physically interacting proteins constitute fundamental functional units responsible for driving biological processes within cells. A faithful reconstruction of the entire set of complexes is therefore essential to understand the functional organization of cells. In this review, we discuss the key contributions of computational methods developed till date (approximately between 2003 and 2015) for identifying complexes from the network of interacting proteins (PPI network). We evaluate in depth the performance of these methods on PPI datasets from yeast, and highlight challenges faced by these methods, in particular detection of sparse and small or sub- complexes and discerning of overlapping complexes. We describe methods for integrating diverse information including expression profiles and 3D structures of proteins with PPI networks to understand the dynamics of complex formation, for instance, of time-based assembly of complex subunits and formation of fuzzy complexes from intrinsically disordered proteins. Finally, we discuss methods for identifying dysfunctional complexes in human diseases, an application that is proving invaluable to understand disease mechanisms and to discover novel therapeutic targets. We hope this review aptly commemorates a decade of research on computational prediction of complexes and constitutes a valuable reference for further advancements in this exciting area.Comment: 1 Tabl

    A Performance Study of Three Disk-based Structures for Indexing and Querying Frequent Itemsets

    Get PDF
    Proceedings of the VLDB Endowment67505-51

    CAMBer: an approach to support comparative analysis of multiple bacterial strains

    Get PDF
    10.1109/BIBM.2010.5706549Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010121-12

    A Flexible Approach to Finding Representative Pattern Sets

    Get PDF
    10.1109/TKDE.2013.27IEEE Transactions on Knowledge and Data Engineerin
    corecore